Abstract

Automatic first-arrival picking is important but difficult for seismic data with low signal-to-noise ratios (SNRs). In
this paper, we propose a new method based on mutual information from information theory. Because the mutual
information between the signal and random noise is zero, random noise has little effect on the first-arrival picks. The
paper compares the principles of three existing first-break picking methods (STA/LTA, AIC, and fractal dimension) with
the proposed method, presents a detailed test and verification on simulated data, and compares the first-break picking
accuracy and efficiency of the algorithms on actual data with different S/N ratios. The results show that for data with a
high S/N ratio, the first-break picking accuracy of all four methods is relatively high. As the SNR decreases, the
first-arrival times picked by the proposed method retain higher precision and good noise immunity. However, the
mutual-information-based method has lower efficiency, and, limited by their algorithmic principles, the fractal
dimension and AIC methods have difficulty picking first breaks on their own. The proposed method is therefore well
suited to identifying seismic events and to preliminarily determining the time range of first breaks.
Introduction

In seismic exploration, first-break picking is the task of determining, given a set of seismic traces, the onsets of the
first signal arrivals as accurately as possible. In general, these arrivals are associated with the energy of waves
refracted at the base of the weathering layer or with the direct wave that travels directly from the source to the
receiver [5].

Accurate determination of the first-arrival onsets (first-break times) is needed for calculating static
corrections, a fundamental stage of seismic data processing. Clearly, the effectiveness of reflection- and
refraction-based methods of static corrections depends on the reliability of the picking process. At the same time,
applications such as near-surface tomographic static corrections (tomographic statics) require rapid automated
detection of the first signal arrival.

Generally, first-break quality is related to the near-surface structure, the source type, and the signal-to-noise ratio
(S/N) conditions. As a consequence, the automated picking of first breaks can be a very difficult task if the data are
acquired in complex near-surface scenarios or if the S/N is low. Moreover, if the source wavelet is zero-phase, as when
vibroseis sources are used, the sweep correlation often produces side lobes that arrive before the first break, thus
making the picking process even more difficult.

Many methods have been proposed for picking the first arrivals of seismic waves. According to the criterion used,
first-arrival picking algorithms can be divided into several types, including the coherence method [3, 4], the
cross-correlation method [2], the neural network method [1], and fractal dimension methods [7]. The coherence and
neural network methods assume certain patterns when picking the first arrivals. Pattern recognition of this type is
therefore effective if a simple earth model adequately describes the earth structure. However, such a simple
earth model rarely matches the near-surface conditions when studied at the level of detail required by most modern
surveys [6]. The advantage of the cross-correlation methods is that they are based on trace-by-trace
evaluation of the first-arrival times, and they are considered the most appropriate for near-surface surveys.

In this paper, we propose a robust method of first-break picking for data sets with high noise levels through
the use of information theory on seismic records. Using synthetic shot records with various noise levels, we
show that the proposed method enhances first arrivals, which helps in picking them. This
is particularly true when the noise level is high, where picking on raw amplitudes fails completely. The
method can be used to better guide the subsequent careful picking of first arrivals and requires one forward. In
contrast to methods based on trace-by-trace picking, which often fail to pick some traces, the proposed method
automatically interpolates missing picks.


Basics of Information Theory 

The average amount of information gained from a given discrete space X is the entropy H(X),

$$H(X) = -\sum_i P(x_i) \log P(x_i) \tag{1}$$


If the logarithm is taken to base two, H is in units of bits. The entropy of information can equivalently be defined
as follows. For a given discrete probability space representing an information source, define on that source the
self-information random variable $I(x) = -\log p(x)$. The mathematical expectation of I is the entropy of the
information source, whose unit is bit/symbol:

$$H(X) = E[I(x)] = -\sum_i p(x_i) \log p(x_i) \tag{2}$$
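As a small numerical illustration of equations (1) and (2), the following sketch estimates the entropy of a symbol sequence from its empirical frequencies. The function name `shannon_entropy` and the three test sequences are illustrative assumptions, not part of the original work.

```python
import math
from collections import Counter

def shannon_entropy(symbols):
    """Entropy in bits of the empirical distribution of `symbols` (equation (1), log base 2)."""
    counts = Counter(symbols)
    total = len(symbols)
    h = 0.0
    for c in counts.values():
        p = c / total              # empirical probability P(x_i)
        h -= p * math.log2(p)      # accumulate -P(x_i) log2 P(x_i)
    return h

# Illustrative sequences (assumptions, chosen so the entropy is known in closed form).
fair_coin = ["H", "T"] * 50        # two equally likely symbols
constant = ["H"] * 100             # one certain symbol
four_way = [0, 1, 2, 3] * 25       # four equally likely symbols

print(shannon_entropy(fair_coin))  # 1 bit per symbol
print(shannon_entropy(constant))   # 0 bits per symbol
print(shannon_entropy(four_way))   # 2 bits per symbol
```

A source with equally likely symbols attains the maximum entropy for its alphabet size, while a fully predictable source carries no information.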


The conventional mutual information is defined as:

$$I(X;Y) = \iint f_{XY}(x,y) \ln \frac{f_{XY}(x,y)}{f_X(x)\, f_Y(y)} \, dx \, dy \tag{3}$$

where $f_{XY}(x,y)$ is the joint probability density function (PDF), and $f_X(x)$ and $f_Y(y)$ are the marginal PDFs of
variables X and Y, respectively.

Mutual information can also be equivalently expressed as: 

$$I(X;Y) = H(Y) - H(Y|X)$$

$$I(X;Y) = H(X) + H(Y) - H(X,Y) \tag{4}$$
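The second form of equation (4) can be checked numerically on paired discrete samples. The sketch below is a minimal plug-in estimate under the assumption that the probabilities are replaced by empirical frequencies; the helper names `entropy` and `mutual_information` are illustrative.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits of the empirical distribution of `labels`."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for paired discrete samples, in bits."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

# Y is an exact copy of X: knowing X removes all uncertainty about Y.
x = [0, 1, 0, 1, 0, 1, 0, 1]
print(mutual_information(x, x))          # equals H(X) = 1 bit

# Every (x, y) combination occurs equally often: empirically independent.
x_ind = [0, 0, 1, 1]
y_ind = [0, 1, 0, 1]
print(mutual_information(x_ind, y_ind))  # 0 bits
```

The two cases mark the extremes: the mutual information equals the entropy when one variable determines the other, and it vanishes when the joint distribution factorizes.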


In practice, the mutual information is very difficult to calculate because the probability distributions of the
variables X and Y are difficult to obtain. In this paper, we propose a new method to calculate the mutual information
based on a recursive idea. In principle, the mutual information measures how dependent the variable X is
on the variable Y.

By making the assignment $[s, q] = [x(t), x(t+T)]$, we can consider a general system $(S, Q)$. The uncertainty
of a measurement of q, given that s has been measured as $s_i$, is

$$H(Q|s_i) = -\sum_j P_{q|s}(q_j|s_i) \log P_{q|s}(q_j|s_i) \tag{5}$$

where $P_{q|s}(q_j|s_i)$ is the probability that a measurement of q will yield $q_j$, given that the measured value of s
is $s_i$. The average uncertainty in a measurement of x at time t + T, given that x has been measured at time t,
is then

$$H(Q|S) = \sum_i P_s(s_i)\, H(Q|s_i) = H(S,Q) - H(S) \tag{6}$$

where

$$H(S,Q) = -\sum_{i,j} P_{sq}(s_i,q_j) \log P_{sq}(s_i,q_j) \tag{7}$$

$H(Q)$ is the uncertainty of q, and $H(Q|S)$ is the uncertainty of q given a measurement of s. According to
equation (4), the amount by which a measurement of s reduces the uncertainty of q is therefore

$$I(Q,S) = H(Q) - H(Q|S) \tag{8}$$
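For reference, substituting equation (6) into equation (8) gives the symmetric form of the mutual information, which matches the second line of equation (4). The marginal $P_q(q_j)$ is notation introduced here for the distribution of q alone; it is an assumption of this sketch and not a symbol used elsewhere in the paper.

```latex
\begin{aligned}
I(Q,S) &= H(Q) - H(Q|S) \\
       &= H(Q) - \bigl[ H(S,Q) - H(S) \bigr] \\
       &= H(Q) + H(S) - H(S,Q) \\
       &= \sum_{i,j} P_{sq}(s_i,q_j) \log \frac{P_{sq}(s_i,q_j)}{P_s(s_i)\, P_q(q_j)}
\end{aligned}
```

The last line shows that $I(Q,S)$ vanishes whenever the joint distribution factorizes, that is, when s and q are statistically independent.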

It is important that mutual information is not a function of the variables s and q, but a functional of the
joint probability distribution $P_{sq}$. If s and q are the same to within the noise, then $I(Q,S)$ specifies the relative
accuracy of the measurements in bits, i.e., how much information one measurement gives about a second
measurement of the same variable.

If S is a delayed image of Q, then a delay phase portrait gives the estimated joint distribution $P_{sq}$, and I is a
statistic calculated on the portrait that evaluates how redundant the second axis is.

New Method for First Arrivals 

The mutual information between two arbitrary channels A and B of a seismic record is calculated
with the software "Smart Signal Processing" developed by Prof. Ming-Yue ZHAI at North China Electric
Power University. The steps are as follows:

(1) First, extract a segment of the seismic wave of length $L$ from channel A, denoted $\{x(n)\}_{n=0,1,\ldots,L-1}$,
where $L$ is the length of the sequence. Then divide the range of the sequence $x$ into $M$ equal intervals, count
the sampling points that fall into each interval, and calculate the probability distribution $p(x)$ of the sequence
$\{x(n)\}_{n=0,1,\ldots,L-1}$.

(2) Second, slide the sequence $\{x(n)\}_{n=0,1,\ldots,L-1}$ of length $L$ along channel B with a certain step. At each
position $d$, record the mutual information between $\{x(n)\}_{n=0,1,\ldots,L-1}$ and $\{y(n+d)\}_{n=0,1,\ldots,L-1}$.
Where channel B is too short to fill the window, fill the missing part with random sequences. This gives the mutual
information $I(d)$.

(3) For $I(d)$, $d = 0, 1, \ldots$, the maximal value of $I(d)$ is $I(D_{\max})$, and $D_{\max}$ is taken as the first
arrival time. The value of $L$ is more than 100. The value of $M$ is about 16 to 64.
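The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the "Smart Signal Processing" implementation: the mutual information is estimated with a plain histogram (plug-in) estimator, the padding of channel B uses uniform random samples, and the function names, the 10 Hz wavelet, the noise level, and the 200-sample delay are all hypothetical.

```python
import math
import random
from collections import Counter

def quantize(values, m):
    """Step 1: split the range of `values` into m equal intervals; return bin indices."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / m or 1.0          # constant windows fall into bin 0
    return [min(int((v - lo) / width), m - 1) for v in values]

def entropy(labels):
    """Shannon entropy in bits of the empirical distribution of `labels`."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def mutual_information(a, b, m):
    """Histogram estimate of I(A;B) = H(A) + H(B) - H(A,B) with m bins per axis."""
    qa, qb = quantize(a, m), quantize(b, m)
    return entropy(qa) + entropy(qb) - entropy(list(zip(qa, qb)))

def pick_delay(x, y, m=16, step=1, rng=None):
    """Steps 2-3: slide x along y and return (d_max, I(d_max))."""
    rng = rng or random.Random(0)
    length = len(x)
    amp = max(abs(v) for v in y) or 1.0
    # Assumption: the missing part of channel B is padded with uniform random samples.
    padded = list(y) + [rng.uniform(-amp, amp) for _ in range(length)]
    best_d, best_i = 0, float("-inf")
    for d in range(0, len(y), step):
        i_d = mutual_information(x, padded[d:d + length], m)
        if i_d > best_i:                  # keep the first maximum of I(d)
            best_d, best_i = d, i_d
    return best_d, best_i

# Hypothetical data: a 10 Hz Ricker wavelet on channel A (L = 128 samples),
# repeated on channel B with a delay of 200 samples plus weak Gaussian noise.
dt, length, f0 = 0.002, 128, 10.0
x = []
for k in range(length):
    a = (math.pi * f0 * (k - length // 2) * dt) ** 2
    x.append((1.0 - 2.0 * a) * math.exp(-a))
noise = random.Random(1)
y = [0.05 * noise.gauss(0.0, 1.0) for _ in range(600)]
for k, v in enumerate(x):
    y[200 + k] += v

d_max, i_max = pick_delay(x, y, m=16)
print(d_max, round(i_max, 3))
```

The histogram estimator is biased upward when the number of joint bins (M squared) is not small compared with L, so windows of pure noise do not yield exactly zero in practice. This is one reason to keep M moderate relative to L, in line with the ranges quoted in step (3).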

In the aforesaid algorithm, the mutual information between the seismic records on channel A and channel B must be
calculated. Because this calculation is very expensive, the paper improves the algorithm in order to reduce the amount
of calculation.

The calculation yields the maximal value of the mutual information, and the time to which it corresponds is the first
arrival time of the seismic wave. This method is more accurate than the energy-ratio method.
However, it needs to calculate the mutual information from the first channel to the last channel, so the amount of
calculation is very large. Its advantage is that the mutual-information pick is accurate and robust: random
noise is statistically independent of the signal of interest, and the mutual information is therefore zero in that case.

Mutual information can be interpreted as a measure of the amount of information one random variable contains about
another; it is the reduction in the uncertainty of one random variable due to knowledge of the other.
Mutual information is not an invariant measure between random variables because it contains the marginal
entropies. Normalized mutual information is a better measure of the "prediction" that one variable can make about
the other.
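The paper does not state which normalization it intends. As one possible illustration, the sketch below divides the mutual information by the geometric mean of the marginal entropies, a common convention; this choice, and the function names, are assumptions made here.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits of the empirical distribution of `labels`."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def normalized_mutual_information(xs, ys):
    """I(X;Y) / sqrt(H(X) H(Y)); returns 0.0 if either variable is constant."""
    h_x, h_y = entropy(xs), entropy(ys)
    if h_x == 0.0 or h_y == 0.0:
        return 0.0                       # a constant variable predicts nothing
    i_xy = h_x + h_y - entropy(list(zip(xs, ys)))
    return i_xy / math.sqrt(h_x * h_y)

x = [0, 0, 1, 1, 2, 2, 3, 3]
relabeled = [10 * v for v in x]          # same information, different symbols
print(normalized_mutual_information(x, relabeled))             # 1: fully predictable
print(normalized_mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))  # 0: independent
```

With this normalization the score lies between 0 and 1 regardless of how many bits each variable carries, which makes values from different channel pairs comparable.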

Applications to Simulated Signals 

In this section, we apply the proposed method to a seismic profile generated by the time-delayed line
(TDL) model. The wavelet used in the TDL model is a Ricker wavelet with a center frequency of 10 Hz. The simulated
data without noise are illustrated in Fig. 1, and the noisy version of the simulated data is plotted in Fig. 2. As an
example, Fig. 3 illustrates the results of the proposed method. From the simulations, we can see that the proposed
method can obtain precise first arrivals even under very low SNR.
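A profile of this kind can be sketched as a set of traces, each containing a 10 Hz Ricker wavelet delayed by a trace-dependent time, with white Gaussian noise added at a chosen SNR. The sampling interval, trace length, number of traces, linear delays, and function names below are hypothetical choices for illustration; the paper does not specify them.

```python
import math
import random

def ricker(t, f0):
    """Ricker wavelet with center frequency f0 (Hz) evaluated at time t (s)."""
    a = (math.pi * f0 * t) ** 2
    return (1.0 - 2.0 * a) * math.exp(-a)

def make_trace(n, dt, delay, f0=10.0):
    """One noise-free trace: a Ricker wavelet centered at time `delay`."""
    return [ricker(k * dt - delay, f0) for k in range(n)]

def add_noise(trace, snr_db, rng):
    """Add white Gaussian noise so that signal power / noise power = 10**(snr_db / 10)."""
    p_signal = sum(v * v for v in trace) / len(trace)
    sigma = math.sqrt(p_signal / 10.0 ** (snr_db / 10.0))
    return [v + rng.gauss(0.0, sigma) for v in trace]

rng = random.Random(0)
dt, n = 0.002, 500                              # 2 ms sampling, 1 s traces (assumed)
delays = [0.30 + 0.01 * k for k in range(8)]    # assumed linear delay per trace
clean = [make_trace(n, dt, d) for d in delays]
noisy = [add_noise(tr, snr_db=0.0, rng=rng) for tr in clean]

# Index of the wavelet peak on the first noise-free trace.
peak = max(range(n), key=lambda k: clean[0][k])
print(peak, peak * dt)
```

Because the Ricker wavelet is zero-phase, the delay here marks the wavelet peak rather than its onset; the onset precedes the peak by roughly half the wavelet duration.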
From the above-mentioned figures we can see that, when the SNR is very high, the proposed method picks the
first arrivals very well. As the SNR decreases, the error of the energy-ratio method grows and many bad
points appear, so that the first arrivals can no longer be picked accurately. When the SNR is 5 dB, the method based on
mutual information can calculate the mutual information from the first channel to the last channel. At the same
time, it needs to calculate only a part of the mutual information, and the amount of calculation for each such part is
the same. To evaluate the performance of the proposed method under
different SNRs, we applied it to the simulated data with different SNRs. The results are illustrated in Fig. 4. From
this figure, we can see that the errors are very low.

Conclusions 

First-arrival picking is a very important issue in seismic data processing. In this paper, we proposed a new method
based on mutual information. The method makes no assumptions about the data and can therefore be applied to any
seismic data. In particular, the method works very well in very low SNR environments, because the mutual information
between the data and the noise is zero, so that noise affects the first-arrival picks as little as possible.


FIGURE 1. THE SIMULATED SEISMIC PROFILE WITH TDL MODEL WITHOUT NOISE (panel title: simulated seismic data without noise in seismic profile)

FIGURE 2. THE SIMULATED SEISMIC PROFILE WITH NOISE (panel title: processed data with SNR = 0)

FIGURE 3. AN EXAMPLE OF FIRST ARRIVAL PICKUP WITH THE PROPOSED METHOD (panel titles: Results of different SNR)

FIGURE 4. ERRORS FOR THE FIRST ARRIVAL PICKUP WITH THE PROPOSED METHOD (panel title: first arrivals pickup errors with the proposed method)